@littledgg
Contributor

@littledgg littledgg commented Nov 3, 2025

Motivation

Achieving batch invariance in the PaddlePaddle framework.
Batch invariance: https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
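For illustration only, here is a minimal sketch of why batch size can change results: floating-point addition is not associative, so a kernel that splits a reduction differently depending on batch size can return a different value for the same row. The helper names below are hypothetical and are not part of this PR.

```python
def sum_left_to_right(values):
    """Add the values one by one in their given order."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_in_chunks(values, num_chunks):
    """Reduce `num_chunks` equal slices first, then add the partial sums.

    Stands in for a kernel whose reduction split depends on the launch
    configuration (hypothetical helper, assumes len(values) is divisible
    by num_chunks).
    """
    size = len(values) // num_chunks
    partials = [
        sum_left_to_right(values[i * size:(i + 1) * size])
        for i in range(num_chunks)
    ]
    return sum_left_to_right(partials)


row = [1e16, 1.0, -1e16, 1.0]
one_chunk = sum_in_chunks(row, 1)   # same order as a plain left-to-right sum
two_chunks = sum_in_chunks(row, 2)  # same data, different reduction split
print(one_chunk, two_chunks)        # the two results differ
```

A batch-invariant kernel avoids this by fixing the reduction order for each row, whatever the batch size.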

To run this, install the following. Paddle must be a fairly recent build (the latest is recommended).

pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
pip install triton

python tests/batch_invariant/test_batch_invariance.py

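If every value reported under Batch-Invariant Mode is 0, the result is correct.
So far, log_softmax is the only operator whose original implementation already appears to be batch-invariant, even though the input data was carefully constructed.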
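The "all 0" criterion can be sketched roughly as follows: run an operator on a full batch and on each row alone, then report the largest absolute difference. This is a plain-Python illustration with hypothetical helper names (`row_mean`, `batched_mean`, `max_batch_diff`); it is not the code in tests/batch_invariant/test_batch_invariance.py.

```python
def row_mean(row):
    """Reduce one row left to right; the order never depends on batch size."""
    total = 0.0
    for x in row:
        total += x
    return total / len(row)


def batched_mean(batch):
    """Apply the same per-row reduction to every row of the batch."""
    return [row_mean(row) for row in batch]


def max_batch_diff(op, batch):
    """Largest |op(batch)[i] - op([batch[i]])[0]| over all rows i."""
    full = op(batch)
    return max(abs(full[i] - op([row])[0]) for i, row in enumerate(batch))


# Hypothetical test data: 4 rows of 8 values each.
batch = [[0.1 * (i + j) for j in range(8)] for i in range(4)]
diff = max_batch_diff(batched_mean, batch)
print(diff)  # 0.0 means the operator is batch-invariant on this input
```

Comparing for an exact 0 rather than within a tolerance is deliberate: batch invariance requires bitwise-identical results, not merely close ones.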

TODO: strictly align the APIs (mm and log_softmax still have issues); consider merging the test cases into a single file; resolve the TODOs listed in the files.

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are no unit tests, explain why in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Nov 3, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Nov 3, 2025
@gongshaotian gongshaotian marked this pull request as ready for review November 3, 2025 07:08
@littledgg littledgg changed the title [Deterministic] Move batch paddle version invariant pkg to Fastdeploy [Deterministic] Move paddle version batch invariant pkg to Fastdeploy Nov 3, 2025
@gongshaotian
Collaborator

Please format your code.

@littledgg
Contributor Author

Please format your code.

done

Copilot AI review requested due to automatic review settings November 12, 2025 08:19
Copilot finished reviewing on behalf of littledgg November 12, 2025 08:21
Contributor

Copilot AI left a comment


Pull Request Overview

This PR introduces batch-invariant implementations of key PaddlePaddle operations (mm, addmm, log_softmax, mean) using Triton kernels to achieve deterministic inference results regardless of batch size. The implementation is adapted from the batch_invariant_ops library and integrated into FastDeploy.

  • Adds custom Triton kernel implementations for deterministic matrix operations and reduction operations
  • Provides a context manager to toggle between standard and batch-invariant modes
  • Includes comprehensive test files demonstrating batch invariance for each operation
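As a rough sketch of the mode-switching idea in the list above, a context manager can flip a module-level flag and restore the previous value on exit. The names `_BATCH_INVARIANT_MODE`, `is_batch_invariant_mode_enabled`, and `set_batch_invariant_mode` are assumptions made for this illustration, and the sketch only toggles a flag. It does not show how the PR swaps in the Triton kernels.

```python
from contextlib import contextmanager

# Hypothetical module-level flag; the real implementation may track state differently.
_BATCH_INVARIANT_MODE = False


def is_batch_invariant_mode_enabled():
    """Report whether batch-invariant mode is currently on."""
    return _BATCH_INVARIANT_MODE


@contextmanager
def set_batch_invariant_mode(enabled=True):
    """Enable or disable the mode inside a `with` block, then restore it."""
    global _BATCH_INVARIANT_MODE
    previous = _BATCH_INVARIANT_MODE
    _BATCH_INVARIANT_MODE = enabled
    try:
        yield
    finally:
        # Restore the earlier state even if the body raised an exception.
        _BATCH_INVARIANT_MODE = previous


states = [is_batch_invariant_mode_enabled()]
with set_batch_invariant_mode(True):
    states.append(is_batch_invariant_mode_enabled())
states.append(is_batch_invariant_mode_enabled())
print(states)  # [False, True, False]
```

Restoring the previous value rather than always resetting to off keeps nested `with` blocks well behaved.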

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 18 comments.

| File | Description |
| --- | --- |
| fastdeploy/model_executor/layers/batch_invariant_ops/batch_invariant_ops.py | Core implementation with Triton kernels for batch-invariant operations and mode switching functionality |
| fastdeploy/model_executor/layers/batch_invariant_ops/__init__.py | Module initialization exporting public API |
| tests/batch_invariant/test_batch_invariance_op_mm.py | Test suite for matrix multiplication batch invariance |
| tests/batch_invariant/test_batch_invariance_op_mean.py | Test suite for mean operation batch invariance |
| tests/batch_invariant/test_batch_invariance_op_logsoftmax.py | Test suite for log_softmax operation batch invariance |
| tests/batch_invariant/test_batch_invariance_op_addmm.py | Test suite for addmm operation batch invariance |

littledgg and others added 2 commits November 12, 2025 16:46
…ariant_ops.py


The version-control leftovers in the original code comments should indeed be removed.

Co-authored-by: Copilot <[email protected]>
